FOREWORD


The Aerospace Medical Panel of AGARD established a working group (AMP-WG-08) on "Evaluation of
Methods to Assess Workload" in the fall of 1976 following approval by the National Board of Delegates.
Working group meetings were held at Cologne (April 1977), London (October 1977), Fort Rucker, Alabama
(May 1978), and Paris (November 1978) concurrent with meetings of the panel. A multi-author report was
prepared and published as an AGARDograph (AG-246, "Survey of Methods to Assess Workload") in August 1979.
That document contained 19 chapters, which can be viewed graphically as follows:


[Chart of the 19 chapters; the only surviving label is "CONCEPTS."]


This  technical  evaluation  report  will  look  across  the  19  chapters  displayed  above,  with  the  goal  of 
providing  a  critique  and  overview. 
I. INTRODUCTION


There are few members of the several AGARD Panels who do not have strong interest, firm opinions, and
frequently practical experience in problems of pilot workload. It is an area of multidisciplinary concern
and activity. Reports, papers, symposia, and working groups are as likely to come, for example, from the
Flight Mechanics Panel or the Avionics Panel, as from the Aerospace Medical Panel. It is important,
therefore, to set the stage for this report. Most of the contributors to AGARDograph No. 246, "Survey of
Methods to Assess Workload," had something to say on this issue. Consider the following quotations from
that AGARDograph.

Chapter 1: "In ordinary uncritical discourse, the phenomena referred to by the terms
"pilot workload" and "fatigue" are easily distinguished. In its broadest
and simplest aspect, pilot workload refers to how much a pilot must do to
perform a specified flight operation. Fatigue is widely understood as a
feeling of tension or weariness, often accompanied by an obvious unwillingness
or inability to continue to work or perform. However, when attempts
are made to quantify the workload imposed on a pilot by a particular aircraft
design, or operational procedure, or to assess the effects of
fatigue upon system performance, important unresolved issues arise in
regard to the more precise specification of workload and fatigue concepts
and to the adequacy of assessment criteria and techniques."

Chapter 2: "Welford (1953) . . . would agree that fatigue is a consequence or
concomitant of workload."

Chapter 3: "Mission and operational requirements present the modern pilot and crew
with ever-changing complex tasks which provide another form of stress.
These major sources of aircrew stress are compounded by the individual's
internal psychophysiologic reaction to stress ..."

Chapter 4: "It would certainly be interesting and important if it were possible to
define the degree and limits of psychophysical workload by means of
technically valid . . . differential qualitative and quantitative assessments
of the various flying specializations. In fact, numerous methods
have been proposed periodically for obtaining a measure of workload by
quantitatively evaluating the functional changes that fatigue can produce."

Chapter 5: "It is important to recognize that the physiological mechanisms of the
organism do not particularly care nor are they necessarily aware that they
are reacting to the effects of workload, the effects of fatigue, or the
effects of stress. Physiological mechanisms provide a link between the
concepts of workload, fatigue, and stress."

Chapter 6: "The term workload is a somewhat ambiguous concept that can be defined in
many ways. We feel that workload encompasses the concepts of performance,
fatigue, and stress, any one of which can be defined in terms of the other."

Chapter 7: "When one reviews the research literature pertaining to mental workload,
two conclusions are readily apparent. Namely, there is no single, agreed
upon definition of mental workload, and there is no single, universal
metric of it. Mental workload is a theoretical construct, and as such,
might best be defined operationally. Clearly, it is related to factors
such as operator stress and effort, but these concepts also require
operational definitions. Reising (1972) provides an excellent overview
of the difficulties and complexities involved in defining and measuring
workload. Rather than provide a single definition, one must consider the
various operational definitions used in measuring operator mental workload.
The systems engineer, for example, may emphasize operational
definitions based on time available to perform a task. Psychologists tend
to emphasize the information processing aspects of mental workload and
operationally define it in terms of measures related to channel capacity
and residual attention. Physiologists, on the other hand, emphasize
considerations of operator stress and arousal."

Chapter 8: "The principal objectives of a supportive workload research and development
program should be (1) establishment of a set of theoretically-consistent
component functions descriptive of the performance of crew
members in relevant system tasks; (2) development of quantitative
(mathematical) expressions of relationships between input-output
parameters for the component functions and appropriate combinations
thereof; (3) integration of the results of (1) and (2) above into a task
analytic/computer modeling methodology; and (4) validation of the
analytic/predictive methodology in a system design, development, and test
effort."

Chapter 9: "A central goal of a military workload analyst is to understand the
determinants of mission success in a military setting. The emphasis is on the
human determinants of mission success with particular consideration to how
the human uses the system he is given to accomplish the mission at hand.
In quantitative workload analysis, the final goal in many instances is to
provide various numerical measures of mission performance. . . A workload
analyst studies the system under consideration to determine its capabilities
and, when appropriate, he designs system changes or modifications
with a view to improving system performance."


Chapter 10: "Operator workload for the task of vehicle manipulation perhaps could
be defined as the sum of and cognitive processes. Sensory inputs to
the operator are utilized to direct control manipulation, obtain
feedback as to degree of effectiveness of the control movements, and
to monitor system status. This input workload is combined with the
psychomotor workload required to move the vehicle controls as dictated
from the sensory inputs and feedback modes. More simply stated, workload
measurements can be derived by objectively measuring the input
and/or output of the operator."

Chapter 11: "The important and close relationship between aircraft handling qualities
and pilot workload has been underlined by several authors."

Chapter 12: "Perhaps more progress has been made toward the utilization of brain
wave information for the enhancement of pilot performance in the area
of monitoring and assessment of workload than in any other area . . .
(to achieve acceptable pilot performance) . . . The available resources
must be sufficient to meet the demands imposed by all tasks which challenge
the operator at any time: the characteristic of task workload or
reserve capacity . . . Even when the resources are adequate, the attention
must be allocated properly to the critical tasks, displays, or
sources of information, so that important sources are not ignored: the
characteristic of attention allocation. The distinction between workload
and allocation are crucial."


Chapter 13: "The assessment of pilot workload is a special case of the measurement
of information-processing load, the aggregated demands placed upon an
individual in the performance of a particular cognitive task or
function. Three general approaches have been employed in the measurement
of information-processing load. The first is that of subjective
estimation. Subjective estimates are involved when workload is estimated
from the task engineer's opinion as to the probable magnitude of
processing load, an opinion that may be based on previous experience
or an analytic theory. However, subjective estimates of workload by
the user or participant are the most common form of workload measurement
in aircraft design. . . The second major method of measuring
processing load employs behavioral measurement. Here the notion is
that the information-processing capacity of an individual is limited
so that the workload imposed by one task can be estimated by the degree
to which it interferes with the simultaneous execution of a secondary
measurement task, such as simple reaction time or manual tracking. . .
The third major method is physiological, in which the response of the
nervous system to the load imposed by an information-processing task
is assessed. Momentary increases in processing load induce short-latency,
short-lived increases in measures of central nervous system
activation. These changes are most evident and most easily measured
in the autonomic nervous system."

Chapter 14: "Of prime importance to research workers dealing with aviator workload,
stress and fatigue is the intriguing notion of an on-line pilot
monitor system during air combat missions. Long considered to be one
of the more stressful and demanding pilot tasks, an air-to-air engagement
taxes the pilot physically, mentally, and perceptually. The
possibility of complementing on-line pilot performance measures with
on-line physiological measures such as heart rate, blood pressure,
etc., would provide an ideal arrangement for the research team interested
in validating laboratory notions of stress, fatigue, or workload
in an operational 'real world' environment. A word of caution is
advised. Some research teams used to the controls and precision
design of experiments in the laboratory will be limited in their
attempts to control the real world. But that is exactly the point.
Many laboratory studies stress the statistical significance of results
without strong support for practical or operational significance. In
pilot workload, for example, the amount or severity of workload in
either a 24-hour or flight segment is certainly useful to 'describe'
the environment but does not by itself have any practical significance
unless it can be related to performance effectiveness, short or long
term. Our physiological reactions to stress or workload can assuredly
be measured but it is only in the context of their relation to performance
that they acquire operational significance."

Chapter 15: "The major concern of human engineering has been to develop command
and control systems wherein better displays and more functional controls
would enable the controller to better perform his demanding task
and ultimately render it less stressful. Basic to this concern has
been an attempt to define the controller's task and to identify certain
aerospace events, such as number of aircraft, aircraft speed, control
sector size, etc., which may be crucial factors in the controller's
job performance. However, such studies have served only to point out
that the real need in evaluating the efficiency of control systems, or
of the operator himself, is the establishment of relevant criterion
measures. Studies in this area, to date, have demonstrated that simple
measures of various aerospace events which comprise the controller's
workload do not fully relate to the complex stresses that are experienced
in the job performance."

Chapter 17: "The workload experienced by air traffic controllers (ATCS) is difficult
to define. One may consider imposed load objectively in terms of numbers
of aircraft handled, but the subjective load perceived by the
controller may be a greatly different quantity. Many factors may operate
as workload modifiers either making the work easier or more difficult:
(1) Type of traffic handled. One aircraft in distress may cause
more "work" than all the other traffic being handled. (2) Weather.
Controllers' perceived workload always increases when pilots cannot maintain
visual separation in instrument meteorological conditions.
(3) Equipment outages and malfunctions causing reversion to manual methods
of control. (4) Disruption of circadian rhythms caused by rotating shifts,
and (5) General physical and emotional conditions resulting from a variety
of off-duty activities and on-duty problems with management or peers."

Chapter 18: "The assessment correlates of workload, performance, and stress can be
divided into several areas: those of physiological correlates, psychological
correlates, stress correlates, psychophysiologic correlates, and
finally, central nervous system (CNS) correlates. We realize that this
is an artificial taxonomy and that many areas of overlap exist."

Several problems are demonstrated in these extracts. First, it is immediately apparent that no single
definition exists. Second, even when contributors are limited to the biotechnology (aerospace medicine and
supporting disciplines) community, a diversity of definitions and approaches emerges. Third, there is a
substantial overlap between subelements of a biotechnology definition, e.g., between physiology,
psychophysiology, psychology, etc. The range of definitions and approaches will, obviously, increase as the
engineering community makes its inputs into the issues of definitions and approaches. This report will, in
an attempt to maintain a simple framework, focus on what appears to this writer to be the most common
elements of the problem as viewed by aerospace medicine:

a. Workload, fatigue and stress are different aspects of a larger problem; the larger problem
is that of maintaining aircrew performance at acceptable levels.

b. There probably is no way to separate workload, fatigue, and stress in terms of definition,
measurement approaches, or research strategies.

c. So far as physiologic mechanisms are concerned, the body doesn't know or care which of the
three it is responding to.

Our approach to the workload problem can, therefore, be described graphically in figure 1, below.
II. ORGANIZING CONCEPTS


Before proceeding to a technical review of the many measures and methods appropriate to workload, it
will be useful to consider some behavioral listings, categories, classifications, metrics, etc. These
are offered to give the reader some organizing concepts as well as a preview of the complexity of the
measurement problem. Of the following 5 tables, 4 come from Chapter 7 of the AGARDograph (No. 246) to
which this technical evaluation report is addressed and the last comes from AGARD Conference Proceedings
CP-216. Tables 3 and 4 not only provide overview matrices; each cell is also annotated to
indicate evaluations by the authors of Chapter 7, W. W. Wierwille, R. C. Williges, and S. G. Schiflett.


Table 1*

Classification of Universal Operator Behavior Dimension
(After Berliner, Angell, and Shearer, 1964)


Processes                   Activities                          Specific Behaviors

1. Perceptual processes     1.1 Searching for and receiving     1.1.1 Detects
                                information                     1.1.2 Inspects
                                                                1.1.3 Observes
                                                                1.1.4 Reads
                                                                1.1.5 Receives
                                                                1.1.6 Scans
                                                                1.1.7 Surveys

                            1.2 Identifying objects, actions,   1.2.1 Discriminates
                                events                          1.2.2 Identifies
                                                                1.2.3 Locates

2. Mediational processes    2.1 Information processing          2.1.1 Categorizes
                                                                2.1.2 Calculates
                                                                2.1.3 Codes
                                                                2.1.4 Computes
                                                                2.1.5 Interpolates
                                                                2.1.6 Itemizes
                                                                2.1.7 Tabulates
                                                                2.1.8 Translates

                            2.2 Problem solving and             2.2.1 Analyzes
                                decision-making                 2.2.2 Calculates
                                                                2.2.3 Chooses
                                                                2.2.4 Compares
                                                                2.2.5 Computes
                                                                2.2.6 Estimates
                                                                2.2.7 Plans

3. Communication processes                                      3.1 Advises
                                                                3.2 Answers
                                                                3.3 Communicates
                                                                3.4 Directs
                                                                3.5 Indicates
                                                                3.6 Informs
                                                                3.7 Instructs
                                                                3.8 Requests
                                                                3.9 Transmits

4. Motor processes          4.1 Simple/Discrete                 4.1.1 Activates
                                                                4.1.2 Closes
                                                                4.1.3 Connects
                                                                4.1.4 Disconnects
                                                                4.1.5 Joins
                                                                4.1.6 Moves
                                                                4.1.7 Presses
                                                                4.1.8 Sets

                            4.2 Complex/Continuous              4.2.1 Adjusts
                                                                4.2.2 Aligns
                                                                4.2.3 Regulates
                                                                4.2.4 Synchronizes
                                                                4.2.5 Tracks


*This and the next three tables are from AGARD-AG-246, "Survey of Methods to Assess Workload," Chapter 7,
"Aircrew Workload Assessment Techniques."
Table 3

Applicability Matrix of Workload Methodologies Across
Universal Operator Behaviors


Workload methodologies:

1.1     Rating Scales
1.2     Interviews and Questionnaires
2.1.1   Task Component, Time Simulation
2.1.2   Information-Theoretic
2.2.1   Nonadaptive, Arith./Logic
2.2.2   Nonadaptive, Tracking
2.2.3   Time Estimation
2.2.4   Adaptive, Arith./Logic
2.2.5   Adaptive, Tracking
2.3     Occlusion
3.1     Single Measure-Primary
3.2     Multiple Measure-Primary
3.3     Math. Modeling
4.1.1   FFF
4.1.2   GSR
4.1.3   EKG
4.1.4   EMG
4.1.5   EEG
4.1.6   ECP
4.1.7   Eye and Eyelid Movement
4.1.8   Pupillary Dilation
4.1.9   Muscle Tension, Tremor
4.1.10  Heart Rate, Heart Rate Variability, Blood Pressure
4.1.11  Breathing Analysis
4.1.12  Body Fluid Analysis
4.1.13  Handwriting Analysis
4.2     Combined Physiological Measure
4.3     Speech Pattern Analysis

Weightings: 3 = Well documented research support

Operator behavior heading: 4.2 Complex/Continuous Motor

[Matrix cell entries are not recoverable.]
Table 5

Workload Measurement Methodology Matrix


Measure types:

1. Measures of system performance
2. Measures of pilot performance
3. Analogues of pilot performance
4. Measures of pilot status

Settings                Examples

A. In a model           ton-miles, kill ratios, attrition
B. In a laboratory      reaction time, tracking scores, perceptual effic.
C. In a simulator       procedural error, emergency response, glide path deviation
D. In the field         sortie rate, in-commission rate, cargo pass-thru time
E. In flight            flight path deviations, eye movement patterns, crew activity

Examples of measures (in source order): altitude control, navigation, gunnery scores, control
movements, visual scanning, communication, pilot opinion, synthetic tasks, traditional tasks,
secondary tasks, neurophys. status, biochem. status

[Matrix cell entries are not recoverable.]

III. PSYCHOPHYSIOLOGIC MEASURES


Substantial progress has been made during the past decade in psychophysiologic measurement,
methodology, instrumentation, and analytic techniques. Less progress has occurred in elegant explanations of
mechanisms. Nevertheless, our considerably improved ability to observe, record, quantify, and interpret
psychophysiologic events and activities makes this an area with substantial potential for the assessment
of pilot workload. Two significant problems must be resolved, however, before that potential can be
realized:

a. Development of field-qualified, cockpit-qualified devices for acquiring data; and

b. Validated relationships between what are sometimes rather subtle psychophysiologic events
and important workload conditions and/or effects.

The pace at which these two problems are being investigated, combined with a scattering of recent successes,
suggests that we should be optimistic. An important role is being played by present and expanding
capabilities provided by mini- and micro-computers as we pursue the application of psychophysiology to
operational problems. The following paragraphs will provide short overviews of several areas of
psychophysiologic measurement. The interested reader should examine AGARDograph No. 244, "Contributions of
Psychophysiologic Techniques to Aircraft Design and Other Operational Problems," by R. D. O'Donnell (July
1979), as well as various chapters in AGARDograph No. 246, "Survey of Methods to Assess Workload," edited
by B. O. Hartman and R. E. McKenzie (August 1979), which this Technical Evaluation Report specifically
addresses.

EMG. Electromyographic measures have both virtues and limitations. EMG is easy to record, and there
is more than enough evidence to support the proposition that with increasing effort there is increasing
muscle tone, and therefore increasing EMG. There is reasonable evidence that muscle tone (and EMG)
increases as workload increases; the effects can be seen with either mental or physical workload. There
is an easily observed relationship between DC changes and motor activity and/or other "physical" criteria
of work. Recent interest and research have focused on biofeedback applications, and the assessment of
states of alertness or arousal. Field-qualified instrumentation is within the state-of-the-art.
Flight-qualified instrumentation is within reach. Two limitations need to be considered: (a) there is increased
muscle tone resulting from "useless" work (consider the difference in muscle tone between a student pilot
and an instructor when stalls/spins are first presented); and (b) we do not yet have a "co-linear" scale
for EMG changes vs. workload.

GSR/BSR. Some of the comments on EMG can be made regarding the galvanic skin resistance and the
related basal skin resistance. GSR/BSR are reasonably easy to acquire and there is a long history to
support the proposition that changes in mental and motor activity will be reflected in GSR/BSR. Biofeedback
applications are common. Assessment of states of alertness/arousal can be done. Mini- and micro-computer
technology will facilitate research progress. Field-qualified instrumentation is within reach;
flight-qualified instrumentation will be more difficult. The absence of co-linear scales for GSR/BSR and
alertness/arousal/workload is a problem. However, the more significant limitation is the confusion
regarding terminology and methodology, coupled with confusion and difficulties in interpretation. Perhaps
it is sufficient to say that GSR/BSR reflects some kind of "activation" but there is need for more research
before this measure is a good candidate for workload assessment.

Cardiovascular. We will deal with cardiovascular measures as a "package" at this time. Heart rate per
se is discussed in detail in Chapters 5, 7, and 11 of AGARDograph 246. The kinds of measures commonly
obtained include blood pressure, stroke volume, blood oxygen levels as determined by noninvasive measures
such as ear oximetry, and heart rate. These measures are reasonably easy to acquire, field-qualified
instrumentation is within the state-of-the-art, and cockpit-qualified instrumentation is within reach,
generally speaking. There is controversy regarding theory, mechanisms, findings, and applications when one
departs from classical cardiovascular physiology to an applications area as operational as pilot workload,
though some skillful applied researchers do well in addressing such controversy. This measurement area is
also characterized as one where there is some "elegance" in the analysis procedures, particularly for
various fragmentations of the EKG waveform. This author is skeptical about such analyses, which may yield a
low payoff for the manhour investment, though analytic power provided by computer technology may resolve
this aspect of elegant analyses.
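
As an illustration of the simpler end of cardiovascular analysis, the following minimal sketch derives mean heart rate and two variability indices from a series of beat-to-beat (R-R) intervals. The function name, the sample interval values, and the choice of SDNN and RMSSD as variability indices are assumptions made for illustration only; they are not taken from AGARDograph 246, and the numbers are invented rather than measured.

```python
import math


def heart_rate_summary(rr_ms):
    """Summarize beat-to-beat (R-R) intervals given in milliseconds.

    Hypothetical helper for illustration; not a method from the source report.
    """
    if len(rr_ms) < 2:
        raise ValueError("need at least two R-R intervals")
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # Mean heart rate in beats per minute, derived from the mean interval.
    mean_hr = 60000.0 / mean_rr
    # SDNN: sample standard deviation of the intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive interval differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_hr_bpm": mean_hr, "sdnn_ms": sdnn, "rmssd_ms": rmssd}


# Invented example data: a more variable series and a steadier, faster one.
resting = [820, 860, 790, 880, 810, 870]
loaded = [700, 705, 698, 702, 699, 703]

for label, series in (("resting", resting), ("loaded", loaded)):
    print(label, heart_rate_summary(series))
```

Indices of this kind are inexpensive to compute once the intervals are available, which is part of the appeal of rate-based measures compared with finer analyses of the waveform.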

Brain Function. Again, we will deal with these measures as a "package." EEG (electroencephalograph)
and ER (evoked responses) are prominent in this package, with interhemispheric assessment showing some
progress. This is a measurement domain where elegance in analysis is commonplace and where computer
technology is indispensable. There is a marked upsurge in applications of evoked responses, particularly the
VER (visual). The changes in this domain frequently relate poorly to changes in other autonomic measures,
posing (perhaps) a problem for the investigator with a multi-measure battery. The measures clearly have
high utility for low versus normal arousal levels. It is O'Donnell's position (AGARDograph 244) that
these are the most powerful of psychophysiologic measures. Field-qualified, and to some extent,
cockpit-qualified instrumentation is within the state-of-the-art.

Visual Measures. Measures of visual function have specific utilities and, for most, a reasonably
impressive history of successful applications. Included are eye movements (EOG), pupil size, and point
of regard. Elegance of instrumentation is customary, though not always essential. Data reduction can be
laborious, particularly where simple instrumentation is employed, although the ability of computers to
"recognize" wave forms can be profitably employed. Experimental methods and the experimental environment
can be demanding, and can pose problems where field-qualified/cockpit-qualified instrumentation is desired.
The eye "point of regard," while an extremely specialized measure, usually employed to assess cockpit
panel design or the more fundamental scanning pattern, has a real potential for workload applications. Of
importance here would be changes in scanning pattern as variations in workload occur, e.g., the elimination
of non-essential scanning elements under conditions of high workload. Workload applications have been
limited to date, but the potential is good, and the measure has the advantage of high reliability and
stability when appropriate instrumentation is employed.


Psychophysiology and Sensory Function. A variety of psychophysiologic measures are available to
assess sensory function. Included are the VER, MTFA (Modulation Transfer Function Area), CFF (Critical
Flicker Fusion), visual acuity, contrast sensitivity, color vision, and auditory measures. As indicated
earlier, VER instrumentation, methodology, and the experimental environment are fairly demanding. Measures
which yield both transient and steady state information are required. Its unique significance is that it
is the final representation of a chain of intervening processes (O'Donnell), while also offering the skilled
investigator the opportunity to fractionate that process into behavioral aspects of special interest, such
as the effect of task errors in central processing. There have been recent applications of VER to the
evaluation of different displays, with reasonable success. MTFA is an alternate approach to VER. Visual
acuity, contrast sensitivity, and color vision have a long history of clinical applications, but applications
in the field on operational problems will require new methods and instrumentation. CFF can be described in
a similar way: a long history of successful clinical applications and experimental applications in problems
of fatigue and environmental stress, but a need for new methods and instrumentation if field applications
are the goal. It is doubtful at this time that a cockpit-qualified capability will emerge. Measures of
auditory function demand strict methods and instrumentation. There is the additional problem of a fair
degree of intra-subject variability. O'Donnell points out that a variety of "psychophysiologic bridges"
are now being employed in auditory measurement, such as GSR and VER. The possibility of "bridged" measures
in applications batteries is an intriguing prospect. However, the potential of auditory measures for field
applications must be viewed with caution because of the methodologic and experimental demands which such
measurements impose.

Psychophysiology and Cognitive Function. It appears that we are on the edge of substantial advances
in the ability to assess cognitive function, including field and perhaps even cockpit measurement
capabilities. There is presently a fair amount of laboratory activity on GSR, EEG, and pupillometry. VER is
emerging as a useful tool for quantifying central processing and decision-making. There is provocative
research underway on interhemispheric measurement. Laboratory enhancements of signal detection and
reaction time measurement are underway, including physiologic "bridges" to cardiac deceleration and evoked
potential. VER has good potential for the analysis of subtle response errors which are not quantified by
other measurement techniques. The prospects are exciting.

Psychophysiology and Attention/Vigilance. The comments above on cognitive function apply generally
to the functional area of attention and vigilance. The use of GSR specifically is a function of how one
conceptualizes attention and vigilance. If the concept is a general state of arousal lasting for a fairly
long time, GSR has utility. If the concept is more event-related, then GSR is too slow to be of much
value. The utility of EMG can be similarly conceptualized. It has particular value as a measure of
preparation for motor activity.

Psychophysiology and Workload. The most common of the psychophysiologic measures is heart rate.
Particular interest is focused on variability in rate (sometimes identified as changes in sinus
arrhythmia). The frequent but not universal finding is a reduction in heart rate variability as workload
increases. There are, as was discussed earlier, more elegant analytic treatments of the EKG waveform, but
rate per se has the demonstrated value of applicability across a large range of tasks. Brain wave activity
is another psychophysiologic measurement domain for workload. Of the several analytic aspects, VER appears
most useful, particularly the late positive components. Where VER is coupled with a noninterference
secondary task, the utility of VER promises to be even greater. Pupil dilation also holds some promise,
probably more for field application than for inflight (cockpit) application. Pupil dilation seems particularly
applicable where workload capacity is exceeded, though perhaps graded changes in pupil size can be related
to graded variations in workload. The limitations for field application reside in the somewhat demanding
requirement to control the visual environment and instrumentation (illumination, eye movement, etc.).
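
The heart rate finding described above can be illustrated with a minimal sketch. It assumes that beat-to-beat (R-R) intervals in milliseconds have already been extracted from the EKG. The two variability indices used here (the standard deviation of the intervals and the root mean square of successive differences) are common choices and are not prescribed by AGARDograph 246. The function name and the interval values are invented for illustration only.

```python
from statistics import mean, stdev


def heart_rate_summary(rr_intervals_ms):
    """Return mean heart rate and two simple variability indices.

    rr_intervals_ms: successive R-R intervals in milliseconds (assumed input).
    """
    if len(rr_intervals_ms) < 3:
        raise ValueError("need at least three R-R intervals")
    # Mean rate in beats per minute, derived from the mean interval.
    mean_hr = 60000.0 / mean(rr_intervals_ms)
    # Overall variability: sample standard deviation of the intervals.
    sd_rr = stdev(rr_intervals_ms)
    # Beat-to-beat variability: root mean square of successive differences.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    rmssd = (sum(d * d for d in diffs) / len(diffs)) ** 0.5
    return {"mean_hr_bpm": mean_hr, "sd_rr_ms": sd_rr, "rmssd_ms": rmssd}


# Invented example data: a relaxed segment and a high-workload segment.
low_load = [850, 910, 820, 890, 800, 900]
high_load = [700, 705, 698, 702, 699, 703]

low_summary = heart_rate_summary(low_load)
high_summary = heart_rate_summary(high_load)
```

With these invented values the high-workload segment shows a higher mean rate and lower variability, which is the pattern the text describes as frequent but not universal.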

Voice analysis has high face validity, with analysis addressing both pitch and formant aspects. To date,
however, applications to workload specifically have been limited. There are other problems. The analysis
is complex, data collection methods require careful control, and analytic instrumentation and software are
demanding. The net result is a substantial possibility that voice analysis can be a source of erroneous
data on workload. GSR, EMG, and CFF have yielded mixed results in workload applications, but have good
potential. The cautions on these methods which have been stated earlier apply particularly to workload.

A promising approach not yet implemented is the application of multiple regression analysis techniques to
the psychophysiologic assessment of workload.
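
As a sketch of what such an application might look like, the following fits an ordinary least-squares multiple regression of a workload criterion on two psychophysiologic predictors. Everything specific here is an assumption made for illustration: the choice of predictors (a heart rate variability index and pupil diameter), the criterion (a subjective workload rating), the function name, and the data, which were generated from an exact linear rule so that the fit can be checked by hand.

```python
def fit_multiple_regression(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one list of predictor values per observation.
    targets: one criterion value per observation.
    Returns [intercept, b1, b2, ...].
    """
    # Prepend a constant 1.0 to each row so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Invented data: (heart rate variability index, pupil diameter in mm) per trial,
# with ratings generated as 5 - 0.05 * hrv + 1.0 * pupil.
predictors = [(60, 3.0), (40, 3.5), (30, 4.0), (50, 4.5), (20, 5.0)]
ratings = [5.0, 6.5, 7.5, 7.0, 9.0]

coefficients = fit_multiple_regression(predictors, ratings)
```

Real data would of course not fit exactly; the point of the sketch is only that several physiologic measures can be weighted jointly against a single workload criterion rather than interpreted one at a time.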


IV.  SUBJECTIVE  MEASURES 

In the workload area, subjective measures are a way of obtaining reports from subjects regarding
perceptions, effects and feelings concerning the imposed burden. The approaches to soliciting such reports
can be broadly categorized as rating scales, questionnaires, and interviews. Each of the three approaches
can be further categorized as structured or unstructured. There is a reluctance on the part of some
workload researchers to accept subjective measures simply because they are not objective. The counter argument
is simply that a significant aspect of workload is one's internal, personal, subjective experience for
which a subjective report has high face validity. A second argument against subjective measures is large
variance. An appropriate response is to point out the necessity of applying rigorously the psychometric
rules for developing such instruments, as well as providing clear instructions and definitions, training
subjects, and even calibrating individual subjects against group means. A final argument is that
subjective data do not always agree with objective data. True. Perhaps we should examine and try to
understand the differences, rather than categorically rejecting the subjective measures. The one argument
against which there is little defense concerns the compromise of data when a subject responds with a bias
he fully intends to inject into the study, or randomly because he is uninterested.
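
One possible reading of "calibrating individual subjects against group means" is sketched below: each subject's ratings are standardized within subject and then re-expressed on the group's mean and spread, so that a habitually lenient or severe rater no longer dominates the variance. This particular procedure, the function name, and the ratings are assumptions made for illustration; the source does not specify a calibration method.

```python
from statistics import mean, pstdev


def calibrate_to_group(ratings_by_subject):
    """Rescale each subject's ratings to the group mean and standard deviation.

    ratings_by_subject: dict mapping a subject label to a list of ratings.
    Returns a dict of the same shape with calibrated ratings.
    """
    all_ratings = [r for rs in ratings_by_subject.values() for r in rs]
    group_mean = mean(all_ratings)
    group_sd = pstdev(all_ratings)
    calibrated = {}
    for subject, rs in ratings_by_subject.items():
        subject_mean = mean(rs)
        subject_sd = pstdev(rs)
        if subject_sd == 0:
            # A subject who never varies carries no ordering information.
            calibrated[subject] = [group_mean for _ in rs]
        else:
            calibrated[subject] = [
                group_mean + group_sd * (r - subject_mean) / subject_sd for r in rs
            ]
    return calibrated


# Invented ratings of the same three tasks by a lenient and a severe rater.
raw = {"subject_A": [2, 3, 4], "subject_B": [6, 8, 10]}
adjusted = calibrate_to_group(raw)
```

After calibration both subjects share the same mean and spread while the rank order of tasks within each subject is unchanged. Whether such rescaling is appropriate depends on whether the between-subject differences are regarded as rater bias or as genuine differences in experienced workload.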

Rating scales are unique among subjective measures because they yield a score which is a point on a
dimension defined by the investigator. Rating scales fall generally into two categories. There are those
which add up "scores" on a series of items, with the sum determining that point on the scale (dimension).
The subject may have a good qualitative perception of where he scores on that scale, but he usually does not
know his score per se. There are those which lead a subject stepwise through a series of reports to a
final, standardized appraisal on some workload issue. In this case, the subject knows clearly where he
scores on that scale, and might in fact have reported a somewhat less standardized appraisal without the
step-wise guidance provided by the rating form. Because of this feature, some investigators eliminate
the step-wise guidance and have the subject simply select a standardized opinion equivalent to a point on
the scale. There are benefits to be obtained from the step-wise guidance, however. Properly designed,
such a rating form also provides insight into what aspects of a task led to the workload rating of "x."
The Cooper-Harper scale on handling qualities is the most commonly used example of this approach, and in
the hands of trained, "standardized" subjects is a powerful tool with high face validity. It is
unquestionably a good model for such an approach. The MIT Flight Transportation Laboratory produced in 1979 a
workload rating scale modeled after Cooper-Harper, with some elaborations. The power of rating scales is
augmented when other measures are also obtained, particularly where such other measures (e.g., EKG, urine
samples, etc.) suggest to the subjects that there is little to be gained from injecting a bias into the
data. Before leaving rating scales, we should acknowledge the utility of having subjects simply put a
mark on a line (which has well defined anchor points). It is a simple, essentially self-scoring procedure
which correlates reasonably well with other subjective reporting approaches.

Questionnaires offer an opportunity to probe into multiple aspects of a workload issue. Where
multiple choice answers (categorical or scaled) are provided, there is the appearance of objectivity. The
utility of such questions can be improved by the application of scaling techniques (such as "semantic
differential"), which is a technology in its own right. Effective questionnaires must be based on a careful
analysis of the task under study, or on extensive background studies of the aspects of a task which present
problems to the performers, or on some other method which is exercised with rigor. Open-ended
questionnaires are, nevertheless, an option, and have the advantage of offering insights perhaps not otherwise
obtained, although at the expense of cumbersome scoring (if a score is the objective).

Much of what has been said about questionnaires applies to interviews. In addition, interviews
provide further opportunities for insights, since responses which are not entirely clear can be pursued with
further questions. There is also the advantage of being able to pursue global feelings and attitudes
which might influence responses, but the process is costly in terms of man-hours.


V.  PERFORMANCE  MEASURES 

Performance measures can be grouped into the broad classes of mission performance, weapons system
(aircraft) performance, primary pilot performance, secondary pilot performance, and laboratory task
performance. With the exception of mission performance, these are familiar to workload researchers, and
have as their focus the pilot and his tasks.

Mission performance. This class of performance measures is defined as the tasking assigned to a
unit (squadron/wing) and refers to unit workload. Therefore, the workload data cannot be traced back to
individuals without considerable effort. Examples would be a tactical sortie surge (an exercise requiring
"x" number of sorties to be flown over "y" days) or an airlift exercise (move "x" Army troops and
equipment from point a to point b in "y" days), or more routinely the monthly flying schedule for any kind of
unit. Workload researchers rarely address unit workload, mostly because of the emphasis on the individual
pilot and the burden imposed on him by his aircraft systems and sortie tasks. Approaches to unit workload
include both field studies, where multi-measure batteries are employed, and computer-based simulation
models. While unit workload translates rather concretely into flight schedules and therefore into aircrew
schedules, it would not be inappropriate to set it aside in view of the already large problem confronting
the workload researcher. There is, however, one aspect worth considering. Existing models accommodate, or can
accommodate, behavioral and physiologic variables, and can therefore be exercised in a parametric fashion
(sensitivity analyses) to identify those variables which potentially compromise unit capability. This, in
turn, would provide external criteria to be used to focus and prioritize workload research.
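
The parametric exercise of a unit model can be sketched as a one-at-a-time sensitivity analysis. The model below is a deliberately crude, hypothetical stand-in for the computer-based simulation models mentioned above; its parameters (including a "fatigue penalty" representing a behavioral variable), the function names, and all values are invented for illustration.

```python
def sorties_generated(p):
    """Toy unit model (hypothetical): sorties a unit can fly during an exercise."""
    # Usable crew duty time per day after a fatigue-related loss.
    usable_hours = max(p["duty_hours_per_day"] - p["fatigue_penalty_hours"], 0.0)
    per_crew_per_day = usable_hours / p["hours_per_sortie"]
    return p["crews"] * per_crew_per_day * p["days"]


def one_at_a_time_sensitivity(model, baseline, change=0.10):
    """Raise each parameter by `change` in turn and rank by relative effect."""
    base_output = model(baseline)
    effects = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1 + change)
        effects[name] = (model(perturbed) - base_output) / base_output
    # Largest absolute effect first.
    return sorted(effects.items(), key=lambda kv: abs(kv[1]), reverse=True)


baseline = {
    "crews": 20,
    "days": 5,
    "duty_hours_per_day": 12.0,
    "fatigue_penalty_hours": 2.0,
    "hours_per_sortie": 2.5,
}

ranking = one_at_a_time_sensitivity(sorties_generated, baseline)
```

In this toy case the output is most sensitive to available duty hours and least sensitive to the fatigue penalty. A ranking of this kind, produced by a validated model rather than this sketch, is the sort of external criterion the text suggests could be used to prioritize workload research.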

Weapons system performance. The focus here is on task outcomes, e.g., destroy a target, conduct an
electromagnetic survey of a potential target area, approach and land. These examples surface an area of
controversy: does it matter that pilotage was degraded if the mission or mission element is completed
successfully? Is a pilot really overloaded if he successfully performs his task? We will examine this
issue more fully in the discussion on primary pilot performance. At any rate, it is clear that objective,
quantified measures of success can be acquired. A number of techniques can be used to acquire such
measures. The most elaborate uses an instrumented range such as that described in Chapter 14, with
elaborate ground tracking and recording playback systems, transponders, and telemetering systems in
aircraft, etc. The data mass from such a facility can be staggering, but with careful study, one can
isolate and acquire measures of direct interest from a workload point of view. A unique advantage is
the enhancement of communication with operational personnel, since the measures are meaningful to them.
Through close coordination and cooperation, it is possible to create a library of data from such test
ranges, though, once again, the mass of data can be overwhelming. Next in level of sophistication is an
aircraft-mounted instrumentation pod, frequently available on test aircraft and sometimes with
telemetering capability. An extensive range of aircraft and pilot performance measures can be obtained, and
frequently recording channels for psychophysiologic or special performance measures can be obtained. The
element missing here is what we might call "terminal" mission measures, such as miss distance on missile
impact. It should be noted that a computer-driven, aircraft-mounted visual display system under
development at the Air Force Aerospace Medical Research Laboratory, called VCASS, has the potential for providing
such data, since the computer portion of the system can generate simulated data on these kinds of measures.
Chase planes and ground observers are more conventional, but expensive and somewhat gross techniques for
obtaining terminal mission measures.

Primary pilot performance. Primary pilot performance measures are those obtained as the pilot
performs his tasks. Early versions were measures such as stick and rudder movement, button pressing of
various sorts, radio communication activity, and so on. The "goodness" of performance determination is based
on comparisons with externally generated standards, with baselines on the same or similar groups of pilots,
or on "early vs. late" in the mission. Hypotheses regarding fatigue or workload are employed, perhaps
without adequate validation. More recent approaches have involved development of "categories" of pilot
performance, such as are shown in Table 1 of this Technical Evaluation Report. What constitutes an
adequate ensemble of categories? Some approaches are:


a.  input                          a.  perceptual processes
b.  central processing             b.  mediational processes
c.  output                         c.  communication processes
                                   d.  motor processes

a.  preflight activity             a.  activity time
b.  inflight activity              b.  errors
c.  postflight activity            c.  describing functions
                                   d.  channel capacity

These are only samples from a large family of categories. In that family are factor-analytic and similar
kinds of categorization more appropriate to laboratory research. Therefore, this issue will be discussed
again. The issue, however, is critical. How does one conceptualize workload measures? What is the
relationship between that conceptualization and the measurement battery, the data collection facility or
environment, the hypotheses to be tested, and the applications which need to be implemented? There is the
very practical aspect raised by many workload researchers: the ability of the pilot to modify his
procedures in the face of high workload so as to reduce workload while also successfully completing his tasks.
When this occurs, measured performance on some categories (e.g., stick and rudder activity) deviates from
the "standard" without apparent cost to task achievement.

Secondary pilot performance. In Chapter 7, secondary pilot performance measures (secondary tasks)
are carried under the major heading of Spare Mental Capacity. That is appropriate in the sense that spare
capacity provides a conceptual umbrella for measurement approaches, of which a secondary task is one. To
quote from Chapter 14, "Spare mental capacity ... is the difference between the total workload capacity
of the operator and the capacity needed to perform the task." Among the elements of this cluster are:

a.  measurement  of  multi-channel  processing 

b.  measurement  of  switching  attention  among  channels 

c.  identification of conflicts and bottlenecks in information processing

d.  variations  in  the  overload  point 

The authors then address in some detail three approaches: task analytic; secondary task; and occlusion
procedures. Task analytic measurement derives from systems engineering, using modeling in various forms,
manipulating empirical data from laboratory and simulator studies. It adopts the single channel concept
of man as an information processor and responder, though this review has already highlighted various ways
in which the pilot may deviate from that mode. Therefore, there is always the need for empirical
verification of task analytic studies. However, mathematical modelling has two unique virtues: an operation (task)
can be examined parametrically with values for each step ranging from artificial clear minimums to
artificial clear maximums, thereby identifying worst case/best case analyses of each step and "choke points";
and a model can generate tenable findings using fragmentary data. These two virtues are not unrelated,
since each supports the other in the nature of the data being manipulated (artificial vs. empirical) and
in the kinds of findings which can be generated. The behaviorists in workload research, in contrast to
mathematically oriented analysts, prefer secondary tasks to assess spare capacity. "Performance on the
secondary task theoretically decreases as the attentional demands of the primary task increase.
Secondary task performance, then, becomes an indirect measure of operator workload" (Chapter 7). Methodology
and the design of tasks vary considerably, although the Sternberg task is widely recognized and frequently
used. The Sternberg task is an item recognition task with a strong supporting analytic model. Graphic
representation of a least-squares, linear regression line is the common way of presenting Sternberg task
data. The underlying hypothesis holds that the slope of this line reflects the information processing
aspect of the task and the intercept depicts the input/output aspect of the task. Secondary tasks have
high potential for becoming field qualified and even becoming cockpit qualified when installation of this
additional piece of hardware in a cockpit is permissible. Occlusion is a technique in which visual
information is presented in samples rather than continuously, and workload is inferred from performance changes.
Its utility for aircrew workload studies in flight is questionable, though it may have merit for
laboratory and simulator studies.
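
The Sternberg analysis described above can be sketched as a simple least-squares fit of reaction time against memory set size, which is the usual abscissa in the item recognition paradigm. The function name and the reaction times below are invented for illustration and were generated from exact straight lines so that the fitted values can be checked by hand.

```python
def least_squares_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented mean reaction times (ms) at each memory set size.
set_sizes = [1, 2, 4, 6]
rt_alone = [438, 476, 552, 628]           # Sternberg task performed by itself
rt_with_primary = [505, 560, 670, 780]    # same task alongside a primary task

slope_alone, intercept_alone = least_squares_line(set_sizes, rt_alone)
slope_loaded, intercept_loaded = least_squares_line(set_sizes, rt_with_primary)
```

Under the hypothesis stated in the text, the steeper slope in the loaded condition would be read as a cost to information processing, and the higher intercept as a cost to the input/output aspect of the task.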

Laboratory task performance. The strategies available to the laboratory researcher on workload are
extensive. They range all the way from a simulation of a specific piloting task or subtask, through
multi-task batteries, the use of a collection of conventional psychomotor tasks, to the use of a single
psychomotor task. In addition, there is the manipulation of test conditions, such as work/rest schedules
or day versus night or schedules which disrupt normal sleep. Finally, all of the workload measures
discussed earlier in this report can be used. How one conceptualizes the real-world task being studied and,
more important, how one conceptualizes the relationship between the real-world task and the tasks being used
in the laboratory are significant issues. There are even issues which should be addressed on the choice
of the analysis technique and how one chooses to explain variations in results where more than one measure
is taken. Even a high fidelity laboratory simulation is open to debate, since many researchers and
operational personnel challenge a laboratory simulation on the grounds that the hazards of flying cannot be
duplicated. The choice of subjects, especially the kind usually available for laboratory studies and
frequently with a restricted age range and no piloting experience, also poses problems in applying results
to the operational world.

Major benefits, at least in terms of face validity, derive from approaching the selection of
laboratory tasks in a systematic fashion. Some investigators develop task batteries on the basis of prior
factor analytic studies, where an array of tasks can be analyzed so as to yield a smaller number of factors.
A term in current use is "universal operator behaviors" (see Chapter 7 in AGARDograph No. 246), which
capture real-world tasks with terms like "tracking," "arithmetic," "logic," etc. A similar approach is to
develop multi-task batteries where tasks are chosen for independence from each other while still reflecting
real-world task elements with reasonable face validity. A US Navy approach involves creating a battery by
selecting tasks from well-standardized published tests. In this latter case, measures were selected from
intelligence tests, cognitive tests, factor analyses of tests of manual dexterity, information processing
tests, tests of central nervous system status, and so on. All three of these approaches have a double
goal: (a) provide data on the problem being studied, and (b) develop a data base across a series of
studies for the purposes of norms, reliability, and consensual validation. This last term was in vogue
more than a decade ago and still provides some basis for asserting that a test or battery or concept has
validity because numerous studies with similar approaches yielded similar results. There is some merit
to this concept, though a direct field validation with "real" subjects flying real missions is clearly
more impressive. At any rate, a single test, no matter how good, is less impressive than a well thought
out battery of tests based on a convincing rationale, because the real-world task of flying military
missions is indeed complex. What are badly needed in this area are standardized tests with convincing
reliability and validity and a task taxonomy (the anatomy of psychomotor skills) which is widely accepted in the
scientific community.


VI.  SUMMARY

This report has provided an overview and evaluation of the measurement of pilot workload using
AGARDograph 246, supplemented by AGARDograph 244, as primary sources. The broad measurement areas of
psychophysiology, subjective reports, and performance testing have been addressed. The concept of
workload and its relation to fatigue and stress have been discussed. By way of summary, we can review the
following as issues and/or needs for workload measurement technology:

a.  We need a widely accepted definition of pilot workload which goes beyond the definition
implicit in the selection of a specific measure.

b.  We need strong bridges of reliability and validity for second order measures, such as
psychophysiologic or biochemical measures.

c.  We need to accept more readily the utility of subjective measures.

d.  We need better standardization of psychomotor tests and more validity and reliability data
for psychomotor tests which constitute "synthetic" tasks.

e.  We need field-qualified and/or cockpit-qualified instruments.

f.  We need convincing, successful field studies which provide the operational community with
applications which demonstrably reduce pilot workload.

g.  We need better models which can cope with fragmentary data and, in addition, provide
information on the operational utility of test results, as well as clues on where to focus our workload research.


There are studies and developments ongoing in research groups across the NATO community which are promising.
Investigators are adopting the broad view at an increasing rate. Significant progress is being made in
workload measurement technology.

14.  Abstract

Military aircraft are becoming increasingly complex, the associated avionics systems more

The methods of assessing workload are set forth in AGARDograph AG-246 "Survey of
Methods to Assess Workload" which was published in August 1979. This companion
document sets forth conclusions on workload measurement methodology.